Abstract

We show how Fisher's information, whose particular character as the fundamental information-geometric
object playing the role of a metric tensor for a statistical differential manifold is already known,
can be derived in a relatively easy manner through the direct application of a generalized logarithm
and exponential formalism to generalized information-entropy measures. We first briefly describe how
the generalization of information-entropy measures naturally comes into being if this formalism is
employed, and recall how the relation between all the information measures is best understood when
described in terms of a particular logarithmic Kolmogorov-Nagumo average. Subsequently, extending
Kullback-Leibler's relative entropy to all these measures defined on a manifold of parametrized
probability density functions, we obtain the metric, which turns out to be given by the Fisher
information matrix elements times a real multiplicative deformation parameter. The metric's
independence from the non-extensive character of the system, and its proportionality to the rate of
change of the multiplicity under a variation of the statistical probability parameter space, emerge
naturally in the frame of this representation.

Keywords: Generalized information-entropy measures, Fisher information, Tsallis, Renyi,
Sharma-Mittal, entropy, information geometry

PACS: 05.70.-a, 65.40.Gr, 89.70.+c, 05.70.Ln



* Corresponding author: marcQ.masi2@tin.it, fax: +39 02 6596997






1 Introduction 



In a previous paper (18), using a formalism based on Kolmogorov-Nagumo means and generalized
logarithms and exponentials, we wrote down the set of entropy functionals, from Boltzmann-Gibbs
entropy through Renyi and Tsallis, up to Sharma-Mittal (25) and a new entropy measure we called the
"supra-extensive entropy", so that the increasing generalization of entropy measures from arithmetic
to non-arithmetic means, and from extensive to non-extensive systems, became particularly compact and
visible in its hierarchical structure. The Sharma-Mittal measure was already developed in 1975 but has
been investigated in generalized thermostatistics only recently by Frank, Daffertshofer and Naudts
(13), (20). We showed that Sharma-Mittal's measure is, however, only one of two possible extensions
that unify Renyi and Tsallis entropy in a coherent picture, and described how it comes naturally into
being together with another "supra-extensive" measure if the formalism of generalized logarithm and
exponential functions is used. Moreover, we could see how the relation between these information
measures is best understood when described in terms of a logarithmic Kolmogorov-Nagumo average.

In this paper we shall further investigate, in particular, the power of the deformed
logarithm-exponential formalism with regard to the relationship between generalized entropy measures
and Fisher information.

Fisher information was originally conceived in the 1920s (4), many years before Shannon's notion of
entropy, as a tool of statistical inference in parameter estimation theory. It must be emphasized that
Fisher's functional is an information measure, but not an entropy measure. There is nevertheless a
strong connection between Fisher information and entropy. This relationship has been outlined on many
occasions since Rao, already in 1945, laid the foundations of statistical differential geometry, also
called information geometry (for a more recent review of the subject see e.g. Amari & Nagaoka (1)).
Rao outlined how a statistical model can be described by a statistical differential manifold, which
can be considered as a Riemannian manifold of parametrized probability distributions (PD) or
probability density functions (PDF) with the metric tensor given by the Fisher information matrix
(FIM). The FIM determines a Riemannian information metric on this parameter space, and is therefore
also called the Fisher metric. This has been the subject of renewed interest more recently in other
branches of information theory and in applications to image processing and econometrics, and has
received some attention in theoretical physics, especially in regard to its still not entirely
understood role in quantum mechanics and perhaps also quantum gravity (see e.g. B.R. Frieden's work,
which tries to derive the laws of physics from a Fisherian point of view, or R. Carroll's review of
some other similar attempts, and references therein).

Less has been done to highlight the links between Fisher information and generalized measures and
non-extensive statistics. Some attempts in this direction were made for instance by F. Pennini and
A. Plastino, by M. Portesi, F. Pennini and A. Plastino, by S. Abe, by J. Naudts (11) and by P. Jizba,
just to mention some examples. However, we feel that a clear exposition of the place that the Fisher
information measure has in the frame of a generalized statistics is lacking. The aim of this paper is
to highlight in a synthetic way the relationship that exists between Fisher information and the
two-parametric generalized entropy measures mentioned here (Renyi, Tsallis, Sharma-Mittal and the
supra-extensive measure, which expands the picture further as a consequence of the q-deformed
formalism) in the sense illustrated by the diagram at the end of the paper, what role the two
parameters play in evaluating the Fisher information matrix, and how it can be retrieved using a
deformed exponential formalism. We will focus our attention on how precisely Fisher information (up to
a real multiplicative factor) emerges naturally as a universal statistical metric tensor for every
generalized information-entropy measure defined on a manifold of PDFs (i.e. for a continuous version
of the above-mentioned entropies), and on obtaining this result in a relatively simple manner using a
representation based on generalized logarithm and exponential functions within the frame of a
Kolmogorov-Nagumo (KN) formalism.

It should also be mentioned that Renyi entropy is not Lesche-stable, is not convex and does not
possess the property of finite entropy production. Therefore any extension of Renyi's entropy cannot
in general possess these properties either. There is some controversy over whether or not this has
thermodynamical implications. However, the theoretical framework we are going to construct here is
intended for a more general context: it can still have its meaning and applications in information
theory, cybernetics or other fields not necessarily restricted to a generalized thermostatistics.
It is with this point of view in mind that we will proceed.

2 The generalized information-entropy measures



Just to make this paper self-contained, let us briefly sum up the aspects of a generalized
information-entropy measure theory which will be relevant to the understanding of the next sections.

2.1 The Boltzmann-Gibbs entropy and Shannon's information measure 

As is well known, the Boltzmann-Gibbs (BG) entropy reads^1

$$S_{BG}(P) = -k \sum_i p_i \log p_i ,$$

with $p_i$ the probability of the system to be in the i-th microstate and $k$ the Boltzmann constant.
The BG entropy becomes the celebrated Shannon information measure (24) if one sets $k = 1$ (as we will
do from now on) and uses the immaterial base $b$ for the logarithm function (we will maintain the
natural logarithm, $b \equiv e$):

$$S_S(P) = -\sum_i p_i \log_b p_i = \sum_i p_i \log_b\left(\frac{1}{p_i}\right) \equiv \sum_i p_i \log\left(\frac{1}{p_i}\right) . \tag{2.1}$$

BG and Shannon's measures are additive, i.e. given two systems, described by two PDs A and B, we
have

$$S_S(A \cap B) = S_S(A) + S_S(B|A) ,$$

with $S_S(B|A)$ the conditional entropy. Such systems are called extensive systems. This is the case
where the total entropy behaves as the sum of the entropies of its parts, and it applies to standard
statistical mechanics. The additive property is reflected in the logarithm function.

2.2 Tsallis' entropy

Nature is however not always a place where additivity is preserved. This is the case for nonlinear
complex systems, for fractal- or multifractal-like and self-organized critical systems, or where
long-range forces are at work (e.g. in star clusters or in systems with long-range microscopic
memory), etc. These non-extensive systems have been investigated especially in the last two decades
(27). Tsallis generalized Shannon's entropy to non-extensive systems as

$$S_T(P,q) = \frac{\sum_i p_i^q - 1}{1-q} = \frac{1}{q-1} \sum_i p_i \left(1 - p_i^{q-1}\right) , \tag{2.2}$$

with $q$ a real parameter. This is now widely known as Tsallis entropy. According to a current school
of thought at least some non-extensive systems can be described by scaled power law probability
functions $p_i^q$, so-called q-probabilities. For $q \to 1$ it reduces to Shannon's measure. Tsallis
entropy extends additivity to pseudo-additivity

$$S_T(A \cap B) = S_T(A) + S_T(B|A) + (1-q) S_T(A) S_T(B|A) . \tag{2.3}$$

In order to describe Tsallis sets the generalized q-logarithm function

$$\log_q x = \frac{x^{1-q} - 1}{1-q} \tag{2.4}$$

turns out to be particularly useful. In a similar way, its inverse, the generalized q-exponential
function, is

$$e_q^x = \left[1 + (1-q)x\right]^{\frac{1}{1-q}} . \tag{2.5}$$

The classical Napier's logarithm and its inverse function are recovered for $q \to 1$. The importance
of the q-logarithm in this context is realized if we understand that it satisfies precisely a
pseudo-additive law

$$\log_q xy = \log_q x + \log_q y + (1-q)(\log_q x)(\log_q y) .$$
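As a quick numerical illustration of (2.4) and (2.5), the following Python sketch implements the q-logarithm and the q-exponential and checks the inverse relation and the pseudo-additive law for one arbitrary choice of values. The function names `log_q` and `exp_q` and the sample values are assumptions made for this illustration only; they do not come from the paper.

```python
import math


def log_q(x: float, q: float) -> float:
    """Generalized q-logarithm (2.4); the natural logarithm is used at q = 1."""
    if q == 1.0:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def exp_q(x: float, q: float) -> float:
    """Generalized q-exponential (2.5), inverse of log_q where 1 + (1 - q) x > 0."""
    if q == 1.0:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        raise ValueError("q-exponential undefined: 1 + (1 - q) x must be positive")
    return base ** (1.0 / (1.0 - q))


# Arbitrary illustrative values (hypothetical, not taken from the paper).
q, x, y = 1.7, 2.5, 0.3

# Inverse relation: exp_q(log_q(x)) should give back x.
roundtrip = exp_q(log_q(x, q), q)

# Pseudo-additive law: log_q(xy) = log_q x + log_q y + (1 - q) log_q x log_q y.
lhs = log_q(x * y, q)
rhs = log_q(x, q) + log_q(y, q) + (1.0 - q) * log_q(x, q) * log_q(y, q)

print(roundtrip, lhs, rhs)
```

Both sides of the pseudo-additive law agree up to floating-point rounding, and setting `q = 1.0` falls back to the ordinary logarithm and exponential.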



^1 Here we begin to introduce a more general symbolism according to which every type of information
measure is labeled with $S_{name}(P,\{q\})$ or $\mathcal{S}_{name}(\mathcal{P},\{q\})$, where $P$ or
$\mathcal{P}$ stands for the family $\{p_i\}$ of PDs or PDFs and $S$ or $\mathcal{S}$ for the discrete
and continuous cases respectively, while $q$ is a scalar or vector parameter whose meaning will become
clear in the following sections.

Exploiting this generalized logarithm and exponential formalism, Tsallis entropy (2.2) can be
rewritten as

$$S_T(P,q) = -\sum_i p_i^q \log_q p_i = \sum_i p_i \log_q\left(\frac{1}{p_i}\right) , \tag{2.6}$$

which is sometimes also referred to as the q-deformed Shannon entropy.

Note that $\log_q x^{\alpha} \neq \alpha \log_q x$ when $q \neq 1$. This is the reason why, if one
thinks in terms of averages, it is more meaningful to write entropy measures with the inverse of the
PD, as in the r.h.s. of (2.6), and why we will prefer this formal representation.
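The equivalence of (2.2) with the two forms in (2.6), and the failure of the power rule for the q-logarithm, can be checked numerically. The sketch below is only an illustration: the distribution `p`, the value of `q` and all function names are hypothetical choices, not part of the original text.

```python
import math
from typing import Sequence


def log_q(x: float, q: float) -> float:
    """q-logarithm (2.4); natural logarithm at q = 1."""
    if q == 1.0:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def tsallis_direct(p: Sequence[float], q: float) -> float:
    """Tsallis entropy in the form (2.2): (sum_i p_i^q - 1) / (1 - q)."""
    return (sum(pi ** q for pi in p) - 1.0) / (1.0 - q)


def tsallis_deformed_shannon(p: Sequence[float], q: float) -> float:
    """First form of (2.6): -sum_i p_i^q log_q p_i."""
    return -sum(pi ** q * log_q(pi, q) for pi in p)


def tsallis_mean_of_gains(p: Sequence[float], q: float) -> float:
    """Second form of (2.6): linear mean of the gains log_q(1 / p_i)."""
    return sum(pi * log_q(1.0 / pi, q) for pi in p)


p = [0.5, 0.3, 0.2]  # hypothetical normalized distribution
q = 0.7

s_direct = tsallis_direct(p, q)
s_shannon_like = tsallis_deformed_shannon(p, q)
s_mean = tsallis_mean_of_gains(p, q)

# Shannon entropy, recovered by the mean-of-gains form at q = 1.
shannon = -sum(pi * math.log(pi) for pi in p)
s_limit = tsallis_mean_of_gains(p, 1.0)

# The power rule fails for q != 1: log_q(x^alpha) differs from alpha * log_q(x).
x, alpha = 3.0, 2.0
power_gap = log_q(x ** alpha, q) - alpha * log_q(x, q)

print(s_direct, s_shannon_like, s_mean, shannon, power_gap)
```

The three Tsallis expressions coincide for a normalized distribution, while `power_gap` is clearly nonzero, which is the reason given above for preferring the representation with the inverse probability.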



2.3 Renyi's entropy 

By looking at the structure of the r.h.s. of (2.1) and (2.6) one can define an information measure as
an average of the elementary information gains

$$I_i \equiv \log_q\left(\frac{1}{p_i}\right) \tag{2.7}$$

associated to the i-th event of probability $p_i$:

$$S_S(P) = \left\langle \log\left(\frac{1}{p_i}\right) \right\rangle_{lin} \tag{2.8}$$

and

$$S_T(P) = \left\langle \log_q\left(\frac{1}{p_i}\right) \right\rangle_{lin} , \tag{2.9}$$

where what is common to both is the underlying arithmetic, or linear, mean $S = \sum_i p_i I_i$.

However, A.N. Kolmogorov and M. Nagumo showed, already in 1930 but independently from each other,
that if we accept Kolmogorov's axioms as the foundation of probability theory, then the notion of
average can acquire a more general meaning as what is called a quasi-arithmetic or quasi-linear mean,
which can be defined as

$$S = f^{-1}\left(\sum_i p_i f(I_i)\right) , \tag{2.10}$$

with $f$ a strictly monotonic continuous function, called the Kolmogorov-Nagumo function
(KN-function). Renyi instead showed that, if additivity is imposed on information measures, then the
whole set of KN-functions must reduce to only two possible cases. The first is of course the linear
mean associated with the KN-function

$$f(x) = x ,$$

while the other possibility is the exponential mean represented by the KN-function

$$f(x) = c_1 b^{(1-q)x} + c_2 ; \quad q \in \mathbb{R} \tag{2.11}$$

with $c_1$ and $c_2$ two arbitrary constants.

Renyi's information-entropy measure is by definition a measure where the single information gains
are averaged exponentially, and reads

$$S_R(P,q) = \frac{1}{1-q} \log_b \sum_i p_i^q \equiv \frac{1}{1-q} \log \sum_i p_i^q , \tag{2.12}$$

with $b$ the logarithm base (we will still always assume $b = e$). When $q \to 1$ Renyi's entropy
boils down to Shannon's entropy.

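In fact, if we choose in (2.11) $c_1 = \frac{1}{1-q} = -c_2$, then because of (2.4) it becomes

$$f(x) = \log_q e^x , \tag{2.13}$$

which inserted in (2.10) with

$$I_i = \log\left(\frac{1}{p_i}\right)$$

shows that (2.12) is equivalent to

$$S_R(P,q) = \left\langle \log\left(\frac{1}{p_i}\right) \right\rangle_{exp} ,$$

where $\langle \cdot \rangle_{exp}$ stands for an average defined by the KN-function (2.13). Compare
this with (2.8) and (2.9).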
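The statement that Renyi's entropy is the exponential KN-mean of the gains $\log(1/p_i)$ can be verified with a few lines of Python. This is a minimal sketch under stated assumptions: the helper `kn_mean`, the example distribution and the value of `q` are hypothetical and serve only to illustrate (2.10), (2.12) and (2.13).

```python
import math
from typing import Callable, Sequence


def kn_mean(p: Sequence[float], gains: Sequence[float],
            f: Callable[[float], float], f_inv: Callable[[float], float]) -> float:
    """Quasi-linear (Kolmogorov-Nagumo) mean (2.10): f^{-1}(sum_i p_i f(I_i))."""
    return f_inv(sum(pi * f(gi) for pi, gi in zip(p, gains)))


def renyi(p: Sequence[float], q: float) -> float:
    """Renyi entropy (2.12) with the natural logarithm."""
    return math.log(sum(pi ** q for pi in p)) / (1.0 - q)


p = [0.5, 0.3, 0.2]  # hypothetical normalized distribution
q = 1.8
gains = [math.log(1.0 / pi) for pi in p]  # elementary gains I_i = log(1/p_i)


def f_exp(x: float) -> float:
    """KN-function (2.13): f(x) = log_q(e^x) = (e^{(1-q)x} - 1) / (1 - q)."""
    return (math.exp((1.0 - q) * x) - 1.0) / (1.0 - q)


def f_exp_inv(y: float) -> float:
    """Inverse of (2.13): f^{-1}(y) = log(e_q^y) = log(1 + (1-q) y) / (1 - q)."""
    return math.log(1.0 + (1.0 - q) * y) / (1.0 - q)


renyi_via_kn = kn_mean(p, gains, f_exp, f_exp_inv)
shannon_via_linear = kn_mean(p, gains, lambda x: x, lambda y: y)

print(renyi_via_kn, renyi(p, q), shannon_via_linear)
```

With the identity as KN-function the same helper returns Shannon's entropy, which mirrors the two admissible cases singled out by the additivity requirement.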
2.4 The Sharma-Mittal and supra-extensive entropy



The next step in the generalization process consists in finding a measure which is non-extensive and
non-additive but contains Tsallis' and Renyi's entropies as special cases. One possible way to obtain
this goes through an extension of the KN-mean. This leads to what is known as the Sharma-Mittal (SM)
entropy (25). However, it is only by exploiting the generalized logarithm and exponential
representation that one retrieves in a compact and fast manner both the SM entropy measure and what we
called the "supra-extensive" (SE) entropy. By using the q-deformed logarithm and exponential formalism
one can easily arrive at a further generalization of Renyi and Tsallis entropies.

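The starting point is the relationship between Tsallis and Renyi entropies

$$S_R(P,q) = \frac{1}{1-q} \log\left[1 + (1-q) S_T(P,q)\right] .$$

From (2.4) and (2.5) we see that this is equivalent to

$$S_R(P,q) = \log e_q^{S_T(P,q)} , \tag{2.14}$$

and therefore

$$S_T(P,q) = \log_q e^{S_R(P,q)} . \tag{2.15}$$

(2.14) and (2.15) immediately suggest two further generalizations:

$$S_{SM}(P,\{q,r\}) = \log_r e_q^{S_T(P,q)} = \frac{1}{1-r}\left[\left(\sum_i p_i^q\right)^{\frac{1-r}{1-q}} - 1\right] \tag{2.16}$$

and

$$S_{SE}(P,\{q,r\}) = \log_q e_r^{S_R(P,q)} = \frac{\left[1 + \frac{1-r}{1-q} \log \sum_i p_i^q\right]^{\frac{1-q}{1-r}} - 1}{1-q} , \tag{2.17}$$

with $r$ another real parameter.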

(2.16) is SM's pseudo-additive measure, while (2.17) is a new type of entropy measure we called
"supra-extensive" because it generalizes to a measure which is neither additive nor pseudo-additive.
We could see (18) how the decisive difference between these two information-entropies is that SM's
measure can also be obtained through the KN-mean as a two-parameter extension of (2.13) (with
$f(x) = \log_q e_r^x$ on $I_i = \log_r(1/p_i)$), while the SE measure does not have such a kind of
generalization. It can also be shown that for two systems A and B, for Sharma-Mittal entropy (instead
of (2.3)) one has

$$S_{SM}(A \cap B) = S_{SM}(A) + S_{SM}(B|A) + (1-r) S_{SM}(A) S_{SM}(B|A) .$$

This indicates that it is the magnitude of the parameter $r$ which stands for the degree of
non-extensivity, while $q$ stands for a PD deformation parameter. When $r \to q$ the deformation
parameter $q$ of the PD merges into the non-extensivity parameter $r$ (which is the reason why in
Tsallis entropy it is $q$ instead of $r$ that appears for the non-extensive character of the system).

The supra-extensive entropy (2.17), however, emerges naturally as a symmetric counterpart of (2.16)
when generalized logarithms and exponentials are used. Further mathematical-physical investigations to
clarify the standpoint of the supra-extensive entropy, what kind of statistics it expresses, if any,
and its relationship with other measures are of course desirable and still necessary. Nevertheless,
something can already be said: what we are going to show here is that this new entropy also shares a
common status with regard to Fisher information with all the other measures.
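The compositions (2.14) to (2.17) translate almost literally into code, which makes the limits $r \to q$ and $r \to 1$ easy to inspect. The sketch below is an illustration under assumptions: the function names, the distribution `p` and the parameter values are hypothetical, and the explicit closed forms are evaluated only to cross-check the compositions.

```python
import math
from typing import Sequence


def log_q(x: float, q: float) -> float:
    """q-logarithm (2.4); natural logarithm at q = 1."""
    if q == 1.0:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def exp_q(x: float, q: float) -> float:
    """q-exponential (2.5); ordinary exponential at q = 1."""
    if q == 1.0:
        return math.exp(x)
    return (1.0 + (1.0 - q) * x) ** (1.0 / (1.0 - q))


def tsallis(p: Sequence[float], q: float) -> float:
    """Tsallis entropy as the linear mean of log_q(1/p_i), eq. (2.6)."""
    return sum(pi * log_q(1.0 / pi, q) for pi in p)


def renyi(p: Sequence[float], q: float) -> float:
    """Renyi entropy via (2.14): log e_q^{S_T}."""
    return math.log(exp_q(tsallis(p, q), q))


def sharma_mittal(p: Sequence[float], q: float, r: float) -> float:
    """Sharma-Mittal entropy via (2.16): log_r e_q^{S_T}."""
    return log_q(exp_q(tsallis(p, q), q), r)


def supra_extensive(p: Sequence[float], q: float, r: float) -> float:
    """Supra-extensive entropy via (2.17): log_q e_r^{S_R}."""
    return log_q(exp_q(renyi(p, q), r), q)


p = [0.5, 0.3, 0.2]  # hypothetical normalized distribution
q, r = 0.7, 1.4

# Explicit right-hand sides of (2.16) and (2.17) for comparison.
s = sum(pi ** q for pi in p)
sm_closed = (s ** ((1.0 - r) / (1.0 - q)) - 1.0) / (1.0 - r)
se_closed = ((1.0 + (1.0 - r) / (1.0 - q) * math.log(s)) ** ((1.0 - q) / (1.0 - r)) - 1.0) / (1.0 - q)

print(sharma_mittal(p, q, r), sm_closed, supra_extensive(p, q, r), se_closed)
```

Calling `sharma_mittal(p, q, q)` or `supra_extensive(p, q, 1.0)` returns the Tsallis value, while `sharma_mittal(p, q, 1.0)` or `supra_extensive(p, q, q)` returns the Renyi value, in line with the symmetric roles of the two measures.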



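2.5 The multiplicity

To introduce ourselves to this, note first of all how we can rewrite the quantity

$$e_q^{\left\langle \log_q \left(\frac{1}{p_i}\right) \right\rangle_{lin}} = e_q^{S_T(P,q)} = e^{S_R(P,q)} \equiv \left\langle \frac{1}{p_i} \right\rangle_{\log_q} , \tag{2.18}$$

where we used what we call the logarithmic mean $\langle \cdot \rangle_{\log_q}$, defined by the
KN-function $f(x) = \log_q x$. Then using (2.5), equations (2.14) to (2.17) can be rewritten as

$$S_T(P,q) = \log_q \left\langle \frac{1}{p_i} \right\rangle_{\log_q} = \log_q \Omega(P,q) ; \tag{2.19}$$

$$S_R(P,q) = \log \left\langle \frac{1}{p_i} \right\rangle_{\log_q} = \log \Omega(P,q) ; \tag{2.20}$$

$$S_{SM}(P,\{q,r\}) = \log_r \left\langle \frac{1}{p_i} \right\rangle_{\log_q} = \log_r \Omega(P,q) ; \tag{2.21}$$

$$S_{SE}(P,\{q,r\}) = \log_q e_r^{\log \left\langle \frac{1}{p_i} \right\rangle_{\log_q}} = \log_q e_r^{\log \Omega(P,q)} . \tag{2.22}$$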

Rewriting things in the language of this representation and using the KN logarithmic mean, one can
see more straightforwardly how Sharma-Mittal's entropy generalizes Renyi's extensive entropy to
non-extensivity, and how the new measure does the same for non-extensivity, generalizing it to a
'generalized non-extensivity' we called supra-extensivity.

The quantity

$$\Omega(P,q) = \left\langle \frac{1}{p_i} \right\rangle_{\log_q} = \left(\sum_i p_i^q\right)^{\frac{1}{1-q}}$$

is well known to have a physical interpretation in statistical mechanics: the multiplicity of the
system, i.e. the number of all possible microstates compatible with its macroscopic state.
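The reading of $\Omega$ as a number of microstates can be made concrete: for a uniform distribution over $W$ states the logarithmic mean of $1/p_i$ equals $W$ for every $q$, and for a non-uniform distribution it gives a smaller effective count. The following sketch is illustrative only; the function name `multiplicity` and the example distributions are assumptions, not taken from the paper.

```python
import math
from typing import Sequence


def multiplicity(p: Sequence[float], q: float) -> float:
    """Omega(P, q) = (sum_i p_i^q)^(1/(1-q)); exp(Shannon entropy) in the limit q = 1."""
    if q == 1.0:
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


q = 0.6

# Uniform distribution over W states: Omega equals W whatever the value of q.
W = 8
uniform = [1.0 / W] * W
omega_uniform = multiplicity(uniform, q)

# A non-uniform (hypothetical) distribution over 5 states has a smaller effective count.
skewed = [0.7, 0.1, 0.1, 0.05, 0.05]
omega_skewed = multiplicity(skewed, q)

# Entropies recovered from the multiplicity as in (2.19) and (2.20).
tsallis_from_omega = (omega_skewed ** (1.0 - q) - 1.0) / (1.0 - q)  # log_q(Omega)
renyi_from_omega = math.log(omega_skewed)                           # log(Omega)

print(omega_uniform, omega_skewed, tsallis_from_omega, renyi_from_omega)
```

Here `omega_skewed` lies between 1 and the number of states, and the Tsallis and Renyi entropies follow from it by applying the deformed or the ordinary logarithm.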



3 Generalizing to relative entropy-information measures 

S. Kullback and R. A. Leibler (15) introduced the notion of relative entropy.

Given a random variable $X$ with $\mathbf{x}$ a specific (scalar or vector) value for $X$ on a
continuous event space, consider continuous differentiable PDFs, $p(\mathbf{x},\theta) \in C^2$, with
$\theta$ a (scalar or vector) parameter. Let $H_1$ be the hypothesis that $X$ is from the statistical
population with PDF $p_1(\mathbf{x},\theta)$ and $H_2$ the hypothesis that it is from the population
with PDF $p_2(\mathbf{x},\phi)$. Then it can be shown, applying Bayes' theorem, that
$\log \frac{p_1(\mathbf{x},\theta)}{p_2(\mathbf{x},\phi)}$ measures the difference between the
logarithms of the odds in favor of $H_1$ against $H_2$ after and before a measurement gave
$X = \mathbf{x}$. Kullback's relative entropy, or our "mean capacity for discrimination" in favor of
$H_1$ against $H_2$, was originally defined as

$$\mathcal{E}_{KL} = \mathcal{E}_{KL}(\{p_1,p_2\}) = \int_{S_{sp}} p_1(\mathbf{x},\theta) \log \frac{p_1(\mathbf{x},\theta)}{p_2(\mathbf{x},\phi)} \, d^n x ,$$

with $S_{sp}$ the entire sample space.

If $p_2(\mathbf{x},\phi) = 1$ (we "discriminate" against certainty), the negative Shannon information
(in its continuous form) is recovered. The different sign is due to the fact that Shannon's
information, as all the measures we are dealing with here, accounts for the amount of information we
still need in order to gain complete knowledge, i.e. the uncertainty about the message. Let us
therefore call Kullback-Leibler relative information-entropy measure, or simply Kullback's measure,

$$\mathcal{S}_{KL} = \mathcal{S}_{KL}(\{p_1,p_2\}) = \int_{S_{sp}} p_1(\mathbf{x},\theta) \log \frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)} \, d^n x . \tag{3.1}$$

Relative entropies can be used to generalize all information measures both in their continuous and
in their discrete versions. Let us start first with discrete PDs.

Given two families of PDs $P = \{p^{(1)}; p^{(2)}\} = \{p_i^{(1)}; p_j^{(2)}\}$
$(i,j = 1,\dots,\Omega)$, Kullback's measure takes the form

$$S_{KL}(P) = \sum_i p_i^{(1)} \log \frac{p_i^{(2)}}{p_i^{(1)}} . \tag{3.2}$$



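Then, in a more general context, we can extend (2.7) to elementary relative information gains as

$$I_i = \log \frac{p_i^{(2)}}{p_i^{(1)}} \quad \text{(for extensive systems)} ,$$

or

$$I_i = \log_s \frac{p_i^{(2)}}{p_i^{(1)}} \quad \text{(for non-extensive systems)} ,$$

with $s = q$ or $s = r$ for Tsallis' and SM's entropies respectively. That is, we can rewrite the
measures with all the KN means considered so far, again generalizing them to relative information
gains, and then replace the so obtained relative Renyi entropy in the exponential expression of (2.22)
(or, proceeding in a somewhat less rigorous manner, simply extend
$\frac{1}{p_i} \to \frac{p_i^{(2)}}{p_i^{(1)}}$ in all of them):

$$S_T(P,q) = \left\langle \log_q \frac{p_i^{(2)}}{p_i^{(1)}} \right\rangle_{lin} = \frac{\sum_i \left(p_i^{(1)}\right)^q \left(p_i^{(2)}\right)^{1-q} - 1}{1-q} ; \tag{3.3}$$

$$S_R(P,q) = \left\langle \log \frac{p_i^{(2)}}{p_i^{(1)}} \right\rangle_{exp} = \frac{1}{1-q} \log \sum_i \left(p_i^{(1)}\right)^q \left(p_i^{(2)}\right)^{1-q} ; \tag{3.4}$$

$$S_{SM}(P,\{q,r\}) = \log_r e_q^{S_T(P,q)} = \frac{1}{1-r} \left[\left(\sum_i \left(p_i^{(1)}\right)^q \left(p_i^{(2)}\right)^{1-q}\right)^{\frac{1-r}{1-q}} - 1\right] ; \tag{3.5}$$

$$S_{SE}(P,\{q,r\}) = \log_q e_r^{S_R(P,q)} = \frac{\left[1 + \frac{1-r}{1-q} \log \sum_i \left(p_i^{(1)}\right)^q \left(p_i^{(2)}\right)^{1-q}\right]^{\frac{1-q}{1-r}} - 1}{1-q} . \tag{3.6}$$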



For $p_i^{(2)} = 1$ they reduce to (2.2), (2.12), (2.16) and (2.17) respectively, while for $q \to 1$
Tsallis' and Renyi's measures (3.3) and (3.4) both become Kullback's measure (3.2). From (3.5) ((3.6))
we recover Renyi's (Tsallis') measure (3.4) ((3.3)) if $r \to 1$, and Tsallis' (Renyi's) measure (3.3)
((3.4)) if $r \to q$. Notice how it is much easier to recognize the limits in the
logarithmic-exponential representation.
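The limits just listed can be checked on a small discrete example. The sketch below evaluates the relative Tsallis and Renyi measures through the logarithmic-exponential representation and compares them with the explicit sums and with Kullback's measure at $q = 1$. The function names and the two distributions are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def log_q(x: float, q: float) -> float:
    """q-logarithm (2.4); natural logarithm at q = 1."""
    if q == 1.0:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def exp_q(x: float, q: float) -> float:
    """q-exponential (2.5); ordinary exponential at q = 1."""
    if q == 1.0:
        return math.exp(x)
    return (1.0 + (1.0 - q) * x) ** (1.0 / (1.0 - q))


def kullback(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Kullback's measure (3.2): sum_i p1_i log(p2_i / p1_i), which is never positive."""
    return sum(a * math.log(b / a) for a, b in zip(p1, p2))


def rel_tsallis(p1: Sequence[float], p2: Sequence[float], q: float) -> float:
    """Relative Tsallis measure (3.3) as the linear mean of log_q(p2_i / p1_i)."""
    return sum(a * log_q(b / a, q) for a, b in zip(p1, p2))


def rel_renyi(p1: Sequence[float], p2: Sequence[float], q: float) -> float:
    """Relative Renyi measure (3.4), obtained as log e_q^{S_T}."""
    return math.log(exp_q(rel_tsallis(p1, p2, q), q))


p1 = [0.5, 0.3, 0.2]  # hypothetical distributions
p2 = [0.4, 0.4, 0.2]
q = 0.7

# Explicit sums on the right-hand sides of (3.3) and (3.4).
overlap = sum(a ** q * b ** (1.0 - q) for a, b in zip(p1, p2))
tsallis_closed = (overlap - 1.0) / (1.0 - q)
renyi_closed = math.log(overlap) / (1.0 - q)

print(rel_tsallis(p1, p2, q), tsallis_closed, rel_renyi(p1, p2, q), renyi_closed, kullback(p1, p2))
```

At `q = 1.0` both relative measures return Kullback's value, and all of them vanish when the two distributions coincide.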

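Straightforwardly we can now extend to continuous PDFs over parameter spaces $\theta$ and $\phi$. The
continuous Tsallis, Renyi, Sharma-Mittal and supra-extensive relative information-entropy measures
become

$$\mathcal{S}_T(\mathcal{P},q) = \frac{\int p_1(\mathbf{x},\theta)^q \, p_2(\mathbf{x},\phi)^{1-q} \, d^n x - 1}{1-q} ; \tag{3.7}$$

$$\mathcal{S}_R(\mathcal{P},q) = \frac{1}{1-q} \log \int p_1(\mathbf{x},\theta)^q \, p_2(\mathbf{x},\phi)^{1-q} \, d^n x ; \tag{3.8}$$

$$\mathcal{S}_{SM}(\mathcal{P},\{q,r\}) = \frac{1}{1-r} \left[\left(\int p_1(\mathbf{x},\theta)^q \, p_2(\mathbf{x},\phi)^{1-q} \, d^n x\right)^{\frac{1-r}{1-q}} - 1\right] ; \tag{3.9}$$

$$\mathcal{S}_{SE}(\mathcal{P},\{q,r\}) = \frac{\left[1 + \frac{1-r}{1-q} \log \int p_1(\mathbf{x},\theta)^q \, p_2(\mathbf{x},\phi)^{1-q} \, d^n x\right]^{\frac{1-q}{1-r}} - 1}{1-q} . \tag{3.10}$$

Of course, one could again rewrite things all over again, to see that the same result appears if we
extend the Kolmogorov-Nagumo mean to continuity as

$$\mathcal{S} = f^{-1}\left(\int p_1(\mathbf{x},\theta) \, f\big(I(\mathbf{x},\theta,\phi)\big) \, d^n x\right) , \tag{3.11}$$

where

$$I(\mathbf{x},\theta,\phi) = \log\left(\frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)}\right) \quad \text{for extensive systems} ;$$

or

$$I_s(\mathbf{x},\theta,\phi) = \log_s\left(\frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)}\right) \quad \text{for non-extensive systems} ,$$

and/or using the generalized q-deformed logarithm and exponential expressions from (2.19) to (2.22),
extending $\frac{1}{p_i} \to \frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)}$.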

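Then, applying (3.11) (with $f = \log_q x$) to obtain the relative and continuous extension of the
multiplicity of section 2.5, one has^2

$$\Omega(\theta,\phi) = \left\langle \frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)} \right\rangle_{\log_q} = e_q^{\mathcal{S}_T(\theta,\phi)} = e^{\mathcal{S}_R(\theta,\phi)} . \tag{3.12}$$

Then we can rewrite (3.7) to (3.10), as the relative continuous extension of (2.19) to (2.22), as

$$\mathcal{S}_T(\theta,\phi) = \log_q \left\langle \frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)} \right\rangle_{\log_q} = \log_q \Omega(\theta,\phi) ; \tag{3.13}$$

$$\mathcal{S}_R(\theta,\phi) = \log \left\langle \frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)} \right\rangle_{\log_q} = \log \Omega(\theta,\phi) ; \tag{3.14}$$

$$\mathcal{S}_{SM}(\theta,\phi) = \log_r \left\langle \frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)} \right\rangle_{\log_q} = \log_r \Omega(\theta,\phi) ; \tag{3.15}$$

$$\mathcal{S}_{SE}(\theta,\phi) = \log_q e_r^{\log \left\langle \frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)} \right\rangle_{\log_q}} = \log_q e_r^{\log \Omega(\theta,\phi)} . \tag{3.16}$$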



4 The role of Fisher information for generalized entropy measures

4.1 The Fisher information measure 

We are now ready to proceed towards the real aim of this paper. We begin with a brief introduction to
Fisher information.

In 1921, R. A. Fisher defined an information measure which could account for the "quality" or
"efficiency" of a measurement. Calling efficient estimator, or best estimator, the best unbiased
estimate $\hat{\theta}(x)$ of $\theta$ after many independent measurements on a random variable $x$,
such that $\langle \hat{\theta}(x) \rangle = \int p(x,\theta) \, \hat{\theta}(x) \, dx = \theta$,
Fisher defined the efficiency or quality of a measurement, $I_F$, as the quantity which satisfies

$$I_F \, e^2 = 1 ,$$

where $e^2 = \int p(x,\theta) \, [\hat{\theta}(x) - \theta]^2 \, dx$ is the mean square error.
Fisher showed (4) that $I_F$ is then uniquely identified as

$$I_F = \int p(x,\theta) \left(\frac{\partial \log p(x,\theta)}{\partial \theta}\right)^2 dx = \left\langle \left(\frac{\partial \log p(x,\theta)}{\partial \theta}\right)^2 \right\rangle_{lin} .$$

For any other estimator one chooses, the Cramer-Rao inequality, or Cramer-Rao bound, holds:

$$I_F \, e^2 \geq 1 .$$
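To make the definition of $I_F$ tangible, the following sketch evaluates the integral numerically for a one-dimensional Gaussian whose mean is the parameter $\theta$, and compares it with the mean square error of the simplest unbiased estimator, $\hat{\theta}(x) = x$ for a single measurement. The Gaussian model, the grid parameters and the function names are assumptions introduced for this example; for this model the expected values are $I_F = 1/\sigma^2$ and $e^2 = \sigma^2$.

```python
import math


def log_gauss(x: float, theta: float, sigma: float) -> float:
    """Logarithm of a Gaussian PDF with mean theta and standard deviation sigma."""
    return -0.5 * ((x - theta) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def fisher_information(theta: float, sigma: float, n: int = 4001,
                       width: float = 12.0, h: float = 1e-5) -> float:
    """I_F = integral of p (d log p / d theta)^2 dx, by a Riemann sum on a uniform grid.

    The score d log p / d theta is taken by a central finite difference, so the
    routine does not rely on the analytic derivative of the Gaussian.
    """
    lo = theta - width * sigma
    dx = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * dx
        score = (log_gauss(x, theta + h, sigma) - log_gauss(x, theta - h, sigma)) / (2.0 * h)
        total += math.exp(log_gauss(x, theta, sigma)) * score ** 2 * dx
    return total


theta, sigma = 0.3, 1.5  # hypothetical parameter values
i_f = fisher_information(theta, sigma)

# Mean square error of the unbiased estimator theta_hat(x) = x for one measurement.
mse = sigma ** 2

print(i_f, 1.0 / sigma ** 2, i_f * mse)
```

The product `i_f * mse` comes out equal to one within numerical accuracy, so this estimator saturates the Cramer-Rao bound for the assumed model.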

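^2 Since we will work with parameters, let us write, for a lighter notation on the multiplicity and
the entropies, $\Omega(\mathcal{P},\{q\}) = \Omega(\theta,\phi)$ and
$\mathcal{S}(\mathcal{P},\{q,r\}) = \mathcal{S}(\theta,\phi)$.

Going over to N-dimensional vector random variables $\mathbf{x} = (x_1,\dots,x_N)$ on an
M-dimensional parameter space $\theta = (\theta_1,\dots,\theta_M)$, Fisher defined his celebrated
(symmetric) Fisher information matrix, whose elements are given by

$$F_{ij}(\theta) = \int p(\mathbf{x},\theta) \, \frac{\partial \log p(\mathbf{x},\theta)}{\partial \theta_i} \, \frac{\partial \log p(\mathbf{x},\theta)}{\partial \theta_j} \, d^n x = \int \frac{1}{p(\mathbf{x},\theta)} \, \frac{\partial p(\mathbf{x},\theta)}{\partial \theta_i} \, \frac{\partial p(\mathbf{x},\theta)}{\partial \theta_j} \, d^n x \tag{4.1}$$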

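with $(i,j = 1,\dots,M)$. If we would further extend to an L-dimensional continuous probability space
$\mathcal{P} = \{p_1,\dots,p_L\}$, then the most general expression for Fisher information involves an
additional sum over the $L$ components of $\mathcal{P}$, together with the sum over the $M$
parameters.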

4.2 The Fisher information matrix as a metric tensor 

We will not go into the details of what would be a much too long exposition of information geometry,
and shall highlight only in an introductory manner the status of the FIM as a metric tensor for a
statistical manifold (for a more rigorous account of the subject see e.g. (1), (28), and references
therein).

Consider a family of differentiable PDFs with N-dimensional continuous vector random variables
$\mathbf{x}$, parametrized by an M-dimensional continuous real vector parameter space $\theta$ on an
open interval $I_\theta \subset \mathbb{R}^M$,

$$\mathcal{F}_\theta = \{p(\mathbf{x},\theta) \in C^2 ;\ \theta \in I_\theta\} .$$

The notion of a differential statistical manifold is identified in the fact that the parameters
$\theta$ can be conceived as providing a local coordinate system for an M-dimensional manifold
$\mathcal{M}$ whose points are in a one-to-one correspondence with the distributions
$p \in \mathcal{F}_\theta$.

Since information-entropy measures are log-probability functionals defined on $\mathcal{M}$, it is
convenient to consider also the function $l : \mathcal{M} \to \mathbb{R}$ on the manifold
$\mathcal{M}$ defined as $l(\theta) = \log p(\mathbf{x},\theta)$. This is commonly called the
log-likelihood function. Labelling the manifold's tangent space $T_\theta(\mathcal{M})$, the
directional derivatives of $l(\theta)$ along the tangent vectors $e_i \in T_\theta(\mathcal{M})$ at a
point in $\mathcal{M}$ with coordinates $\theta$ are (using the shorthand
$\partial_i = \frac{\partial}{\partial \theta_i}$)
$\partial_i l(\theta) = \frac{\partial_i p(\mathbf{x},\theta)}{p(\mathbf{x},\theta)}$.

The FIM (4.1) can also be seen as the expectation value with respect to $p(\mathbf{x},\theta)$ of the
products of the partial derivatives of $l(\theta)$, which is the reason why in the literature it is
frequently written as

$$F_{ij}(\theta) = E\left[\partial_i l(\theta) \, \partial_j l(\theta)\right] .$$

This is a symmetric, non-degenerate, bilinear form on a vector space of random variables
$\partial_i l(\theta)$. But a Riemannian metric $g$ is by definition a symmetric non-degenerate inner
product on the manifold's tangent space $T_\theta(\mathcal{M})$, and one can therefore consider the
FIM as the statistical analogue of the metric tensor for a statistical manifold.

It is also worth mentioning that Corcuera & Giummolè showed that the FIM has the unique properties of
being covariant under reparametrization of the parameter space of the manifold and invariant under
reparametrization of the sample space (see also Wagenaar (28) for a review). This is an appealing
aspect which possibly suggests that Fisher information might play some role in future quantum
spacetime theories.

Now, the metric tensor tells how to compute the distance between any two points in a given space.
Here we are considering the distance between two points on a statistical differential manifold mapped
on a measure functional, i.e. the informational difference between them. This idea can be introduced
by regarding Kullback's relative information measure as accounting for the net dissimilarity between
two families of PDFs with parameters $\theta$ and $\phi$. Intuitively one can imagine this as
measuring a "distance" between these two families. However, strictly speaking, this is not a metric
distance because it is not symmetric and does not satisfy the triangle inequality (on statistical
manifolds one has to consider an extended version of Pythagoras' law). The symmetry condition,
however, can be restored if instead of the single information measure we use the divergence
$\mathcal{D}$ of two PDFs, $p_1$ and $p_2$, defined as^3

$$\mathcal{D}(p_1,p_2) = \frac{\mathcal{S}(p_1,p_2) + \mathcal{S}(p_2,p_1)}{2} .$$

If we choose to set

$$p_1(\mathbf{x},\theta) = p(\mathbf{x},\theta) ; \quad p_2(\mathbf{x},\phi) = p(\mathbf{x},\theta + d\theta) , \tag{4.2}$$

then the symmetric divergence
$\mathcal{D}(p(\mathbf{x},\theta), p(\mathbf{x},\theta + d\theta)) = \mathcal{D}(\theta,\theta + d\theta)$
can be regarded as an extension of the square of the Riemannian distance between two nearby
distributions. Expanded to second order it gives

$$\mathcal{D}(\theta,\theta + d\theta) = \frac{1}{2!} \sum_{i,j} \left[\frac{\partial^2 \mathcal{D}(\theta,\phi)}{\partial \theta_i \partial \theta_j}\right]_{\phi=\theta} d\theta_i \, d\theta_j + O(d\theta^3) ,$$



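because $\mathcal{D}(\theta,\phi)$ is extremal at $\phi = \theta$ and the first order vanishes. It is
the second order, not the first, which is the leading one in every information measure divergence, and
it can be shown (see e.g. (1)) that it is the second derivative of the divergence which defines the
metric, i.e.

$$g_{ij} = \left[\frac{\partial^2 \mathcal{D}(\theta,\phi)}{\partial \theta_i \partial \theta_j}\right]_{\phi=\theta} = \frac{1}{2} \left[\frac{\partial^2 \left(\mathcal{S}(\theta,\phi) + \mathcal{S}(\phi,\theta)\right)}{\partial \theta_i \partial \theta_j}\right]_{\phi=\theta} . \tag{4.3}$$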



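In the case of Kullback's measure (3.1) the divergence is defined as

$$\mathcal{D}_{KL}(\theta,\phi) = \frac{1}{2} \int \left[p(\mathbf{x},\theta) - p(\mathbf{x},\phi)\right] \log \frac{p(\mathbf{x},\phi)}{p(\mathbf{x},\theta)} \, d^n x .$$

From (4.3), and keeping in mind that requiring the normalization condition to hold for every $\theta$
implies

$$\int \frac{\partial p(\mathbf{x},\theta)}{\partial \theta_i} \, d^n x = \int \frac{\partial^2 p(\mathbf{x},\theta)}{\partial \theta_i \partial \theta_j} \, d^n x = 0 , \tag{4.4}$$

we have

$$g_{ij}^{KL}(\theta) = -\int \frac{1}{p(\mathbf{x},\theta)} \, \frac{\partial p(\mathbf{x},\theta)}{\partial \theta_i} \, \frac{\partial p(\mathbf{x},\theta)}{\partial \theta_j} \, d^n x = -F_{ij}(\theta) ,$$

which is the (i,j)-th element of the negative FIM (4.1).

This is a very important and well-known result from information geometry. It is in this sense that
$g_{ij}$ can be seen as a metric tensor which measures a "distance" on a statistical manifold in a
Riemannian space. In this sense Fisher information can be said to be a sort of "mother information
measure".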
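The result $g_{ij}^{KL} = -F_{ij}$ can be reproduced numerically for a simple one-parameter family. The sketch below takes Gaussians with fixed width and variable mean (an assumption made for this example, for which $F = 1/\sigma^2$), evaluates the symmetric divergence by a Riemann sum and differentiates it twice with respect to $\theta$ by a finite difference at $\phi = \theta$. Function names and numerical parameters are hypothetical.

```python
import math


def log_gauss(x: float, mean: float, sigma: float) -> float:
    """Logarithm of a Gaussian PDF with the given mean and standard deviation."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def kl_divergence_sym(theta: float, phi: float, sigma: float, center: float,
                      n: int = 4001, width: float = 12.0) -> float:
    """D_KL(theta, phi) = 1/2 * integral of [p(x,theta) - p(x,phi)] log(p(x,phi)/p(x,theta)) dx."""
    lo = center - width * sigma
    dx = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * dx
        l_theta = log_gauss(x, theta, sigma)
        l_phi = log_gauss(x, phi, sigma)
        total += 0.5 * (math.exp(l_theta) - math.exp(l_phi)) * (l_phi - l_theta) * dx
    return total


theta0, sigma, h = 0.3, 1.5, 1e-3  # hypothetical values; h is the finite-difference step

# Second derivative with respect to theta at phi = theta0, by a central difference.
d_plus = kl_divergence_sym(theta0 + h, theta0, sigma, theta0)
d_zero = kl_divergence_sym(theta0, theta0, sigma, theta0)
d_minus = kl_divergence_sym(theta0 - h, theta0, sigma, theta0)
g_kl = (d_plus - 2.0 * d_zero + d_minus) / h ** 2

fisher = 1.0 / sigma ** 2  # Fisher information of the assumed Gaussian location family

print(g_kl, -fisher)
```

With the sign convention of (3.1) the divergence is never positive, so its second derivative is the negative Fisher information, as stated above.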



5 The Fisher metric for generalized information-entropy measures

We can generalize this result of information geometry. The Fisher metric for the Tsallis, Renyi,
Sharma-Mittal and supra-extensive measures can be obtained by considering the relative entropy
measures as defined in (3.7), (3.8), (3.9) and (3.10) respectively (with
$p_1 = p(\mathbf{x},\theta)$, $p_2 = p(\mathbf{x},\phi)$), from their respective symmetric divergence

$$\mathcal{D}(\theta,\phi) = \frac{\mathcal{S}(\theta,\phi) + \mathcal{S}(\phi,\theta)}{2} ,$$

defined on $\mathcal{F}_\theta$.

What we need is the evaluation of (4.3) for each information measure (3.7) to (3.10). One can of
course compute directly the (somewhat cumbersome) second derivatives $\partial_{ij}\mathcal{S}$ each
time (and for each $\theta \leftrightarrow \phi$ parameter exchange). However, the q-deformed
generalized logarithm and exponential formalism and the KN-representation make this task easier, since
they require only the evaluation of the derivatives of Tsallis' entropy; the rest follows almost
automatically.

The final result will be that $g_{ij}^{KL}$ still remains the fundamental quantity, but for these
more general (supra-extensive, Sharma-Mittal, Renyi and Tsallis) relative entropies the statistical
metric tensor ($g_{ij}^{SE}$, $g_{ij}^{SM}$, $g_{ij}^{R}$ and $g_{ij}^{T}$ respectively) turns out to
be only slightly extended by a scalar multiplicative q-deforming factor as

$$g_{ij}^{SE}(\theta) = g_{ij}^{SM}(\theta) = g_{ij}^{R}(\theta) = g_{ij}^{T}(\theta) = q \, g_{ij}^{KL}(\theta) = -q F_{ij}(\theta) . \tag{5.1}$$

^3 The notion of divergence in information geometry can be established in a rigorous way and is much
more general. We shall however use only this particular type of definition because it is sufficient
for our purposes.



This also shows that while $g_{ij}$ depends on the q-deforming parameter, it is independent of the
r-extensivity parameter. This is quite natural, since Fisher information accounts for the "quality" of
a measurement or, so to speak, our "differential capacity to distinguish" locally between two
neighboring PDFs, and this in turn depends on the "form" of the PDF (the q-scaling), but is
independent of the extensive, non-extensive or supra-extensive character, since these are global
features of the system. We shall see how it is the normalization condition imposed on PDFs that leads
to this independence (and recover the known fact that this is also the reason why $g_{ij}$ is
symmetric). Moreover, it will also become clear how Fisher information measures the rate of change of
the multiplicity under a parameter variation.



5.1 Fisher from Tsallis information 

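First of all consider the derivation rules

$$\frac{d \log_q x}{dx} = \frac{1}{x^q} ; \qquad \frac{d e_q^x}{dx} = \left(e_q^x\right)^q . \tag{5.2}$$

Writing Tsallis' continuous relative entropy (3.7) in the q-deformed Shannon notation of (2.6) we have

$$\mathcal{S}_T(\theta,\phi) = \int p_1(\mathbf{x},\theta) \, \log_q\left(\frac{p_2(\mathbf{x},\phi)}{p_1(\mathbf{x},\theta)}\right) d^n x . \tag{5.3}$$

We must be careful to remember that in general the entropy measures considered are not symmetric, and
we have to consider also

$$\mathcal{S}_T(\phi,\theta) = \int p_2(\mathbf{x},\phi) \, \log_q\left(\frac{p_1(\mathbf{x},\theta)}{p_2(\mathbf{x},\phi)}\right) d^n x .$$

Then, applying the q-logarithm derivation rule (5.2), one obtains for the first case (as before
$\partial_i = \frac{\partial}{\partial \theta_i}$, and
$\partial_{ij} = \frac{\partial^2}{\partial \theta_i \partial \theta_j}$)

$$\partial_i \mathcal{S}_T(\theta,\phi) = \int \left[\log_q\left(\frac{p_2}{p_1}\right) - \left(\frac{p_2}{p_1}\right)^{1-q}\right] \partial_i p_1 \, d^n x , \tag{5.4}$$

and

$$\partial_{ij} \mathcal{S}_T(\theta,\phi) = -q \int \left(\frac{p_2}{p_1}\right)^{1-q} \frac{1}{p_1} \, \partial_i p_1 \, \partial_j p_1 \, d^n x + \int \left[\log_q\left(\frac{p_2}{p_1}\right) - \left(\frac{p_2}{p_1}\right)^{1-q}\right] \partial_{ij} p_1 \, d^n x ,$$

while in the second case one has quite different derivatives

$$\partial_i \mathcal{S}_T(\phi,\theta) = \int \left(\frac{p_2}{p_1}\right)^{q} \partial_i p_1 \, d^n x , \tag{5.5}$$

and

$$\partial_{ij} \mathcal{S}_T(\phi,\theta) = -q \int \left(\frac{p_2}{p_1}\right)^{q} \frac{1}{p_1} \, \partial_i p_1 \, \partial_j p_1 \, d^n x + \int \left(\frac{p_2}{p_1}\right)^{q} \partial_{ij} p_1 \, d^n x ,$$

where, for brevity, $p_1 = p_1(\mathbf{x},\theta)$ and $p_2 = p_2(\mathbf{x},\phi)$.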



Note that these derivatives are not the same as those one would obtain directly from (3.7), because
in that case one implicitly assumes the normalization condition to be satisfied a priori. (3.7) and
(5.3) are numerically identical only for a normalized PDF. The logarithmic-exponential representation,
as in the latter case, therefore not only represents a more general expression but also highlights
better where and with what effects the normalization comes into play. Restricting to PDFs as in (4.2),
then, because of the normalization condition (4.4), from (5.4) and (5.5) one obtains

$$\left[\partial_i \mathcal{S}_T(\theta,\phi)\right]_{\phi=\theta} = \left[\partial_i \mathcal{S}_T(\phi,\theta)\right]_{\phi=\theta} = 0 , \tag{5.6}$$

while, remembering the expression for the FIM (4.1),

$$\left[\partial_{ij} \mathcal{S}_T(\theta,\phi)\right]_{\phi=\theta} = \left[\partial_{ij} \mathcal{S}_T(\phi,\theta)\right]_{\phi=\theta} = -q \int \frac{1}{p(\mathbf{x},\theta)} \, \partial_i p(\mathbf{x},\theta) \, \partial_j p(\mathbf{x},\theta) \, d^n x = -q F_{ij}(\theta) ,$$

which, through (4.3), finally gives us $g_{ij}^T = -q F_{ij}(\theta)$.

So, since the FIM is symmetric, we see incidentally that in this case, and as we shall see also in
all the others, it is in particular the normalization condition which renders the statistical metric
tensor $g_{ij}$ symmetric.
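The relation $g_{ij}^T = -q F_{ij}$ can be checked numerically on the same kind of toy model used for illustration purposes: Gaussians of fixed width $\sigma$ whose mean is the parameter, so that $F = 1/\sigma^2$. The sketch evaluates (5.3) by a Riemann sum for both orderings of the arguments and takes the second derivative with respect to $\theta$ by a central difference at $\phi = \theta$. The model, the function names and the numerical parameters are assumptions, not part of the original derivation.

```python
import math


def log_q(x: float, q: float) -> float:
    """q-logarithm (2.4); natural logarithm at q = 1."""
    if q == 1.0:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def log_gauss(x: float, mean: float, sigma: float) -> float:
    """Logarithm of a Gaussian PDF with the given mean and standard deviation."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def tsallis_relative(mean_a: float, mean_b: float, sigma: float, q: float,
                     center: float, n: int = 4001, width: float = 12.0) -> float:
    """Eq. (5.3): integral of p_a log_q(p_b / p_a) dx, with p_a, p_b Gaussians of equal width."""
    lo = center - width * sigma
    dx = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * dx
        la = log_gauss(x, mean_a, sigma)
        lb = log_gauss(x, mean_b, sigma)
        total += math.exp(la) * log_q(math.exp(lb - la), q) * dx
    return total


def second_difference(f, t: float, h: float) -> float:
    """Central finite-difference approximation of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h ** 2


theta0, sigma, q, h = 0.3, 1.5, 0.7, 1e-3  # hypothetical values

# S_T(theta, phi): theta enters through the first argument, phi is held at theta0.
hess_first = second_difference(
    lambda t: tsallis_relative(t, theta0, sigma, q, theta0), theta0, h)

# S_T(phi, theta): theta enters through the second argument.
hess_second = second_difference(
    lambda t: tsallis_relative(theta0, t, sigma, q, theta0), theta0, h)

g_tsallis = 0.5 * (hess_first + hess_second)  # eq. (4.3)
fisher = 1.0 / sigma ** 2

print(hess_first, hess_second, g_tsallis, -q * fisher)
```

Both orderings give the same second derivative at $\phi = \theta$, and the resulting metric is the Fisher information of the assumed model multiplied by $-q$.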

5.2 Fisher from Renyi information 

Evaluating Tsallis' derivatives is indispensable but, once they are established, we no longer need
to compute any derivative directly for all the other measures if we work with generalized logarithms
and exponentials. We do not even need to repeat the derivation for the symmetry considerations.

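In fact, (4.3) for Renyi's measure can be obtained from (2.14). From the derivation rules (5.2) we
obtain (the arguments $(\mathbf{x},\theta)$ or $(\mathbf{x},\phi)$ of the measures, of the PDFs or of
the FIM shall be omitted when they are not needed)

$$\partial_i \mathcal{S}_R = \partial_i \log e_q^{\mathcal{S}_T} = \left(e_q^{\mathcal{S}_T}\right)^{q-1} \partial_i \mathcal{S}_T , \tag{5.7}$$

and

$$\partial_{ij} \mathcal{S}_R = \left(e_q^{\mathcal{S}_T}\right)^{q-1} \left[\partial_{ij} \mathcal{S}_T + (q-1) \left(e_q^{\mathcal{S}_T}\right)^{q-1} \partial_i \mathcal{S}_T \, \partial_j \mathcal{S}_T\right] .$$

Since $[\mathcal{S}_T]_{\phi=\theta} = 0$, applying the normalization condition (i.e. because of
(5.6)), we have

$$\left[\partial_{ij} \mathcal{S}_R\right]_{\phi=\theta} = \left[\partial_{ij} \mathcal{S}_T\right]_{\phi=\theta} , \tag{5.8}$$

which leads us to state $g_{ij}^R = g_{ij}^T = -q F_{ij}$.

5.3 Fisher from Sharma-Mittal information

Use Sharma-Mittal entropy as given in (2.16) and proceed as in the previous case:

$$\partial_i \mathcal{S}_{SM} = \partial_i \log_r e_q^{\mathcal{S}_T} = \left(e_q^{\mathcal{S}_T}\right)^{q-r} \partial_i \mathcal{S}_T ,$$

and

$$\partial_{ij} \mathcal{S}_{SM} = \left(e_q^{\mathcal{S}_T}\right)^{q-r} \left[\partial_{ij} \mathcal{S}_T + (q-r) \left(e_q^{\mathcal{S}_T}\right)^{q-1} \partial_i \mathcal{S}_T \, \partial_j \mathcal{S}_T\right] .$$

And again because of (5.6)

$$\left[\partial_{ij} \mathcal{S}_{SM}\right]_{\phi=\theta} = \left[\partial_{ij} \mathcal{S}_T\right]_{\phi=\theta} ,$$

so that we have again $g_{ij}^{SM} = g_{ij}^T = -q F_{ij}$. Note that it is the normalization
condition, forcing the first derivatives on the r.h.s. to vanish, which leads to the independence of
$g_{ij}$ from the non-extensivity parameter $r$.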



5.4 Fisher from supra-extensive information 
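From (2.17) we get

$$\partial_i \mathcal{S}_{SE} = \partial_i \log_q e_r^{\mathcal{S}_R} = \left(e_r^{\mathcal{S}_R}\right)^{r-q} \partial_i \mathcal{S}_R ,$$

and

$$\partial_{ij} \mathcal{S}_{SE} = \left(e_r^{\mathcal{S}_R}\right)^{r-q} \left[\partial_{ij} \mathcal{S}_R + (r-q) \left(e_r^{\mathcal{S}_R}\right)^{r-1} \partial_i \mathcal{S}_R \, \partial_j \mathcal{S}_R\right] .$$

Because of (5.6) and (5.7)

$$\left[\partial_i \mathcal{S}_R(\theta,\phi)\right]_{\phi=\theta} = \left[\partial_i \mathcal{S}_R(\phi,\theta)\right]_{\phi=\theta} = 0 ,$$

then, remembering (5.8), one has
$[\partial_{ij} \mathcal{S}_{SE}]_{\phi=\theta} = [\partial_{ij} \mathcal{S}_R]_{\phi=\theta} = [\partial_{ij} \mathcal{S}_T]_{\phi=\theta}$,
and finally

$$g_{ij}^{SE} = g_{ij}^{R} = -q F_{ij} .$$

Therefore, neither $\partial_{ij} \mathcal{S}_{SM}$ nor $\partial_{ij} \mathcal{S}_{SE}$ depends on
the parameter $r$ at $\phi = \theta$, because of the normalization condition.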
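The independence from $r$ can be seen numerically by composing the deformed logarithms and exponentials exactly as in (2.14), (2.16) and (2.17). To keep the sketch short it uses a closed form that holds for two Gaussians of equal width $\sigma$ whose means differ by $\delta$, namely $\int p_1^q p_2^{1-q} \, dx = \exp\left(-q(1-q)\delta^2 / (2\sigma^2)\right)$; this closed form, the model and all names are assumptions introduced for this example, and $F = 1/\sigma^2$ for this model.

```python
import math


def log_q(x: float, q: float) -> float:
    """q-logarithm (2.4); natural logarithm at q = 1."""
    if q == 1.0:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def exp_q(x: float, q: float) -> float:
    """q-exponential (2.5); ordinary exponential at q = 1."""
    if q == 1.0:
        return math.exp(x)
    return (1.0 + (1.0 - q) * x) ** (1.0 / (1.0 - q))


def tsallis_gauss(delta: float, sigma: float, q: float) -> float:
    """Relative Tsallis entropy (3.7) for equal-width Gaussians with mean separation delta."""
    overlap = math.exp(-q * (1.0 - q) * delta ** 2 / (2.0 * sigma ** 2))
    return (overlap - 1.0) / (1.0 - q)


def renyi_gauss(delta: float, sigma: float, q: float) -> float:
    """Relative Renyi entropy via (2.14): log e_q^{S_T}."""
    return math.log(exp_q(tsallis_gauss(delta, sigma, q), q))


def sm_gauss(delta: float, sigma: float, q: float, r: float) -> float:
    """Relative Sharma-Mittal entropy via (2.16): log_r e_q^{S_T}."""
    return log_q(exp_q(tsallis_gauss(delta, sigma, q), q), r)


def se_gauss(delta: float, sigma: float, q: float, r: float) -> float:
    """Relative supra-extensive entropy via (2.17): log_q e_r^{S_R}."""
    return log_q(exp_q(renyi_gauss(delta, sigma, q), r), q)


def hessian_at_zero(f, h: float = 1e-3) -> float:
    """Central finite-difference second derivative of f at delta = 0."""
    return (f(h) - 2.0 * f(0.0) + f(-h)) / h ** 2


sigma, q = 1.5, 0.7  # hypothetical values
target = -q / sigma ** 2  # -q F for the assumed model

results = {
    r: (hessian_at_zero(lambda d: sm_gauss(d, sigma, q, r)),
        hessian_at_zero(lambda d: se_gauss(d, sigma, q, r)))
    for r in (0.5, 1.0, 1.4, 2.5)
}

for r, (g_sm, g_se) in results.items():
    print(r, g_sm, g_se, target)
```

For every value of `r` tried, both second derivatives agree with $-q/\sigma^2$, while changing `q` rescales them, in agreement with (5.1).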

5.5 Working with the multiplicity 

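Just for didactic purposes, in order to show how the generalized exponential-logarithmic formalism
combined with the KN expressions can be used, we reach the same conclusion from the perspective of the
entropies as a measure of multiplicity. From (3.12) one has

$$\partial_i \Omega = \Omega^q \, \partial_i \mathcal{S}_T ,$$

while

$$\partial_{ij} \Omega = \Omega^q \left[\partial_{ij} \mathcal{S}_T + q \, \Omega^{q-1} \, \partial_i \mathcal{S}_T \, \partial_j \mathcal{S}_T\right] ,$$

which implies that

$$\left[\partial_{ij} \Omega\right]_{\phi=\theta} = \left[\partial_{ij} \mathcal{S}_T\right]_{\phi=\theta} = -q F_{ij} .$$

Therefore, working with information-entropy measures expressed with the multiplicity as in (3.13) to
(3.16), the SM measure's second derivative is

$$\partial_{ij} \mathcal{S}_{SM} = \partial_i \left(\Omega^{-r} \, \partial_j \Omega\right) = \Omega^{-r} \left[\partial_{ij} \Omega - r \, \Omega^{-1} \, \partial_i \Omega \, \partial_j \Omega\right] ,$$

and, since $[\Omega]_{\phi=\theta} = 1$ and $[\partial_i \Omega]_{\phi=\theta} = 0$, one has

$$\left[\partial_{ij} \mathcal{S}_{SM}\right]_{\phi=\theta} = -q F_{ij} ,$$

and the above results for Tsallis', Renyi's and Shannon's measures all follow again as special cases.
Finally, for the SE measure

$$\partial_{ij} \mathcal{S}_{SE} = \partial_i \left(\left(e_r^{\log \Omega}\right)^{r-q} \Omega^{-1} \, \partial_j \Omega\right) ,$$

and, as was to be expected, the final result simplifies to

$$\left[\partial_{ij} \mathcal{S}_{SE}\right]_{\phi=\theta} = \left[\partial_{ij} \Omega\right]_{\phi=\theta} = -q F_{ij} .$$

Therefore, since the second order of the multiplicity is the leading one, we can say that Fisher
information accounts (times a negative multiplicative deformation factor) for the change of
multiplicity (the change of the number of microstates of a system) under a statistical parameter
variation. This is another way to interpret the fundamental connection between Fisher information and
entropy measures.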



6 Conclusion 

Using the notion of Kullback-Leibler's relative entropy and generalizing it to all entropies, we
showed, as was already known for Kullback's measure, that once again the FIM appears as the same
statistical metric tensor (5.1) for the Tsallis, Renyi, Sharma-Mittal and supra-extensive measures
too. The differential-geometric properties of the divergence for each measure are independent of the
extensive, non-extensive or supra-extensive character of the system, and depend only on the
q-deforming parameter. This independence and the symmetry of $g_{ij}$ are guaranteed by the
normalization condition. We could also see how Fisher information has to be interpreted as a quantity
proportional to the change of the information multiplicity under a variation of the statistical
parameter. Generally, the derivation of Fisher information proved to be easier to obtain by exploiting
the q-deformed logarithm and exponential formalism or the KN-representation of information-entropy
measures. The overall picture of the generalization process we have undertaken so far can finally be
summarized in the following diagram (where $p_1$ and $p_2$ can both be PDs or PDFs).

[Diagram: Hierarchy of generalized relative entropy measures, including the Fisher matrix.]